Abstract

We study the feasibility of a precise measurement of the mass of a 120 GeV
MSM Higgs boson through direct reconstruction of $ZH \to q\bar{q}H$ events that would
be achieved in a future $e^+e^-$ linear collider operating at a center-of-mass energy of
500 GeV. Much effort has been put into a "realistic simulation" by including irreducible
and reducible backgrounds, realistic detector effects and reconstruction procedures,
and sophisticated analysis tools involving Neural Networks and kinematical fitting.
As a result, the Higgs mass is determined with a statistical accuracy of 50 MeV and
the Z-Higgs Yukawa coupling is measured to 0.7%, assuming 500 fb$^{-1}$ of integrated
luminosity.

Results presented at the International Workshop on Linear Colliders (LCWS99) 
Sitges, Barcelona, Spain, 28 April - 5 May 1999 
1 Introduction



The Standard Model [?] (MSM) of electroweak and hadronic interactions has been
successfully tested so far to an extremely high degree of accuracy. However, one of its key
elements, the Higgs mechanism, remains to be tested experimentally. It is through the
interaction with the ground state Higgs field that the fundamental particles acquire mass,
which in turn sets the scale of the coupling with the Higgs boson. Once the Higgs boson
is found, all its properties have to be accurately measured. In the MSM, the Higgs
mass is not predicted by the theory, but the profile of the Higgs particle (decay width,
branching ratios and production cross-section) is uniquely determined once $M_H$ is fixed,
hence the importance of performing a precise measurement of the Higgs mass. In order
to establish experimentally that this particle has indeed the properties of a Higgs boson,
we need to prove that it is a scalar particle and that it arises from a field with a vacuum
expectation value which contributes to the W and Z masses [?]. The latter is achieved by
determining the Higgs Yukawa couplings to the Z and W gauge bosons, which in an $e^+e^-$
collider can be measured from the Higgstrahlung process, $e^+e^- \to Z^* \to ZH$, and the fusion
processes, $W^*W^*, Z^*Z^* \to H$. The scalar nature of this particle can be verified from
its production angular distribution.

In this work we have assumed that the MSM Higgs boson in the "Intermediate Mass
Region" ($M_Z < M_H < 2M_W$) will have already been discovered at the present (LEP2,
TeVatron) or future (LHC, NLC) accelerators, and study the feasibility of a precise
measurement of its mass and Yukawa coupling to the Z boson from the Higgstrahlung process.

This mass range is favored both experimentally and theoretically. Experimentally, global
fits to electroweak precision observables at LEP, SLC and TeVatron give the upper limit
$M_H < 260$ GeV at 95% CL [?], whereas direct searches at LEP2 give $M_H > 95.2$ GeV at
95% CL [?]. Theoretically, the stability and triviality bounds constrain the MSM Higgs
boson mass to be in the range 130 GeV $< M_H <$ 180 GeV [?]. A Higgs boson in the
"Intermediate Mass Region" would be relatively more difficult to detect at LHC than a heavy
Higgs: whereas a Higgs boson in the mass range 150-700 GeV can be found straightforwardly
in the decay $H \to ZZ \to 4\ell$, the discovery of the light MSM Higgs boson would be based on
$H \to \gamma\gamma$ and would require 100 fb$^{-1}$ of data (a year's running at the design
luminosity). Instead, a light Higgs boson might be discovered with less than 1 fb$^{-1}$ of
integrated luminosity at a future $e^+e^-$ linear collider operating at $\sqrt{s} = 300$ GeV [?].
However, the job of a linear $e^+e^-$ collider would rather be to study in great detail the
properties of the Higgs particle, which can uniquely be attained in the clean and very high
luminosity ($\int \mathcal{L}\,dt > 100$ fb$^{-1}$/year) environment expected.

As already mentioned, we will focus on the case of the MSM Higgs boson, which is
equivalent, for $M_H < 130$ GeV, to the case of the light $h$ MSSM Higgs boson close to the
decoupling regime (where the MSM and MSSM Higgs sectors look practically the same).
For the sake of definiteness, we will assume $M_H = 120$ GeV and concentrate on the Z
hadronic decay mode: $e^+e^- \to ZH \to q\bar{q}H$. For $M_H = 120$ GeV, the Higgs decays
dominantly to $b\bar{b}$ ($BR(H \to b\bar{b}) \simeq 72\%$), which leads to multi-jet event topologies
involving at least 2 b-jets in the final state. Therefore, one of the crucial experimental
aspects will be flavor tagging. Most previous studies have focussed on the leptonic Z decay
mode since it is less affected by background and there is a better intrinsic resolution on
$M_H$ (through the recoil mass distribution). For instance, in [15] a statistical uncertainty
of $(\Delta M_H)_{stat} \simeq 110$ MeV is obtained from the combination of the $ZH \to e^+e^-H$
and $\mu^+\mu^-H$ channels at $\sqrt{s} = 350$ GeV, assuming 500 fb$^{-1}$ of integrated luminosity.
The disadvantage is the lack of statistics ($BR(Z \to \ell^+\ell^-) \simeq 10\%$) as compared to
the hadronic decay channel ($BR(Z \to q\bar{q}) \simeq 70\%$). Therefore, it is fully justified to
investigate to what extent the hadronic decay channel can be made a competitive measurement.



2 Experimental Strategy 

As already mentioned, the main production mechanisms of the MSM Higgs boson in $e^+e^-$
annihilation are the Higgstrahlung process, $e^+e^- \to Z^* \to ZH$ (with a cross-section
scaling as $1/s$ and therefore dominating at low energy), and $WW$-fusion,
$e^+e^- \to \nu\bar{\nu}W^*W^* \to \nu\bar{\nu}H$ (which dominates at high energies since the
cross-section scales as $\log(s/M_H^2)$). One order of magnitude smaller than $WW$-fusion,
there is also the contribution from $ZZ$-fusion: $e^+e^- \to e^+e^-Z^*Z^* \to e^+e^-H$. At
$\sqrt{s} = 500$ GeV and for 100 GeV $< M_H <$ 200 GeV, the Higgstrahlung and $WW$-fusion
processes have approximately the same cross-section.

The lowest order total cross-section for $ZH$, assuming $M_H = 120$ GeV, is shown in
Fig. 1 as a function of the center-of-mass energy. The effect of radiative processes in the
initial state (initial state radiation^1 and beamstrahlung) on the total cross-section is also
illustrated. The total cross-section at $\sqrt{s} = 500$ GeV is about 66 fb, which represents an
event sample of about 6600 events/year assuming^2 $\mathcal{L} = 10^{34}$ cm$^{-2}$s$^{-1}$.
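The yearly event count quoted above is plain unit bookkeeping, which the following minimal sketch reproduces. The function names are illustrative; the inputs (66 fb, $10^{34}$ cm$^{-2}$s$^{-1}$, $10^7$ s per year) are the ones quoted in the text.

```python
# Unit bookkeeping for the quoted event yield (inputs taken from the text).
FB_IN_CM2 = 1e-39  # 1 fb = 1e-39 cm^2


def integrated_luminosity_fb(inst_lumi_cm2s: float, seconds: float) -> float:
    """Integrated luminosity in fb^-1 from an instantaneous luminosity in cm^-2 s^-1."""
    return inst_lumi_cm2s * seconds * FB_IN_CM2


def expected_events(sigma_fb: float, lumi_fb: float) -> float:
    """Expected number of produced events, N = sigma * integrated luminosity."""
    return sigma_fb * lumi_fb


lumi_per_year = integrated_luminosity_fb(1e34, 1e7)   # one "year" of running = 1e7 s
n_zh_per_year = expected_events(66.0, lumi_per_year)  # sigma(ZH) ~ 66 fb at 500 GeV
print(lumi_per_year, n_zh_per_year)
```

With these inputs one year corresponds to about 100 fb$^{-1}$, i.e. about 6600 $ZH$ events before any selection.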

The measurement is performed at $\sqrt{s} = 500$ GeV. In principle, one would prefer to sit
at the peak of $\sigma_{ZH}$, that is $(\sqrt{s})_{peak} \simeq 260$ GeV for $M_H = 120$ GeV, since the
cross-section is enhanced by a factor $\simeq 3.4$ with respect to $\sqrt{s} = 500$ GeV and the
background from $t\bar{t}$ is not present. However, it is unclear how feasible it would be to
collect enough integrated luminosity at such a low center-of-mass energy in order to perform
a precise measurement. Therefore, a more realistic strategy could be to perform a first
direct measurement of $M_H$ at the $t\bar{t}$ threshold ($\sqrt{s} = 350$ GeV, where still
$\sigma_{ZH}(350\ \mathrm{GeV}) \simeq 2.1\,\sigma_{ZH}(500\ \mathrm{GeV})$), concurrently with the top
threshold measurements, and then perform the precise measurement at $\sqrt{s} > 500$ GeV,
where most of the integrated luminosity will be collected.
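For orientation, the position of the maximum can be estimated with a commonly quoted rule of thumb for the Higgstrahlung cross-section. This approximation is an assumption added here for illustration; it reproduces the value of about 260 GeV quoted above when $M_Z \simeq 91.2$ GeV is used:

$$(\sqrt{s})_{peak} \simeq M_Z + \sqrt{2}\,M_H \simeq 91.2\ \mathrm{GeV} + 1.414 \times 120\ \mathrm{GeV} \simeq 261\ \mathrm{GeV}.$$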
^1 Initial state radiation will be used hereafter as synonymous with bremsstrahlung.
^2 It has been assumed that 1 year's running = $10^7$ s.

There are two main strategies for the Higgs mass measurement:

• calculation of the mass recoiling against the Z: this has the nice feature of being
independent of assumptions about the Higgs decay modes [15], and would show the
Higgs resonance even for invisible Higgs decays. The recoil mass is computed from
the reconstructed Z 4-momentum assuming the nominal center-of-mass energy:

$$M_H^2 = s - 2\sqrt{s}\,E_Z + M_Z^2, \tag{2.1}$$

where $E_Z$ and $M_Z$ are, respectively, the Z reconstructed energy and invariant mass.
Therefore, this method is more suitable for the Z leptonic decay channel since a
more precise Z reconstruction is possible.

• direct reconstruction of the invariant mass of the Higgs decay products. As will be
shown, this method works better for hadronic Z decays than the recoil mass method.
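The recoil-mass relation (2.1) can be sketched in a few lines. The helper `z_energy_two_body` and the numerical value of $M_Z$ are assumptions introduced for the illustration (two-body kinematics at the nominal $\sqrt{s}$, with no ISR or beamstrahlung); they are not part of the analysis described here.

```python
import math


def recoil_mass(sqrt_s: float, e_z: float, m_z: float) -> float:
    """Mass recoiling against the Z, Eq. (2.1): M^2 = s - 2*sqrt(s)*E_Z + M_Z^2."""
    m2 = sqrt_s**2 - 2.0 * sqrt_s * e_z + m_z**2
    if m2 < 0.0:
        raise ValueError("negative recoil mass squared; check the inputs")
    return math.sqrt(m2)


def z_energy_two_body(sqrt_s: float, m_z: float, m_h: float) -> float:
    """Z energy in e+e- -> ZH at fixed sqrt(s), ignoring ISR and beamstrahlung."""
    return (sqrt_s**2 + m_z**2 - m_h**2) / (2.0 * sqrt_s)


SQRT_S, M_Z, M_H = 500.0, 91.19, 120.0  # GeV; the M_Z value is an assumed input
e_z = z_energy_two_body(SQRT_S, M_Z, M_H)
m_rec = recoil_mass(SQRT_S, e_z, M_Z)
m_rec_shifted = recoil_mass(SQRT_S, e_z + 1.0, M_Z)  # E_Z mismeasured by +1 GeV
print(e_z, m_rec, m_rec_shifted)
```

In this toy setup a 1 GeV shift in the reconstructed Z energy moves the recoil mass by roughly 4 GeV (the slope is $\sqrt{s}/M_H \simeq 4.2$), which illustrates why the method relies on a precise Z reconstruction.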

The total cross-section depends sensitively on the Z-Higgs Yukawa coupling, which
can thus be inferred from the comparison of the measured total cross-section with the
theoretical expectation as a function of $g_{ZZH}$. Therefore, it is possible to verify whether
this field is responsible for the whole Z mass (as in the MSM), or only for a fraction
of it.



3 Simulation Aspects 

The signal and the different backgrounds have been generated with PYTHIA. The top
quark and Higgs masses have been assumed to be $m_t = 175$ GeV and $M_H = 120$ GeV,
respectively. Interference between signal and background has been neglected. The event
samples have been generated at $\sqrt{s} = 500$ GeV, including initial state radiation (ISR) and
beamstrahlung. For efficiency reasons, a generation cut $\sqrt{s'} > 100$ GeV has been applied.
Initial state radiation has been considered in the structure function approach and
beamstrahlung has been generated with the aid of the CIRCE program [?]. Fragmentation,
hadronization and particle decays are handled by JETSET [?], with parameters tuned
to LEP2 data.



3.1 Detector Simulation 

Once the events have been generated, they are processed through a fast simulation [?]
of the response of a detector for a future linear collider. The detector components, which 
are assumed to be: 

• a vertex detector, 

• a tracker system with main tracker (2 m radius TPC embedded in a 2 Tesla magnetic
field), forward tracker and forward muon tracker,

• an electromagnetic calorimeter, 

• a hadronic calorimeter and 

• a luminosity detector, 



are implemented according to the Large Detector model in [12].

This fast detector simulation provides a flexible tool since its performance
characteristics can be varied within a wide range. The calorimeter response is treated in a
realistic way using a parametrization of the electromagnetic and hadronic shower deposits
obtained from a full GEANT simulation [13] and including a cluster finding algorithm.
Pattern recognition is emulated by means of a complete cross-reference table between
generated particles and detector response. The output of the program consists of a list
of reconstructed objects: electrons, gammas, muons, charged and neutral hadrons and
unresolved calorimeter clusters, as a result of an idealized Energy Flow (EF) algorithm
incorporating track-cluster matching.

3.2 B-tagging 

Jets coming from b and c-quark decays are tagged based on the non-zero lifetime of these 
quarks, using the Vertex Detector (VDET). In this study we have assumed the performance 
of a CCD VDET in a 1 cm radius beampipe. 

In order to look for this lifetime signal, we have chosen to use the 3D impact parameter 
(IP) of each charged track (distance of closest approach between the track and the b 
production point). Since the statistical resolution of the IP varies strongly from one track 
to another, we use the estimated statistical significance of the measured IP to define our 
tag. The b-tagging algorithm is kept simple so that the success of the analysis does not
depend on detector details. More efficient algorithms can be developed by making use of 
multivariate techniques, such as Neural Networks. 

In Fig. 3 the IP significance distributions for different Z hadronic decays are compared.
The lifetime signature can be clearly seen for $Z \to b\bar{b}$ in the positive tail. We will use
the IP distribution for prompt tracks (those originating from $Z \to u\bar{u}, d\bar{d}, s\bar{s}$) to define, for
each track, a probability "to be consistent with originating from the primary vertex". This
information can then be combined to get a probability per jet or for the whole event [?].

In order to test the performance of such b-tagging, we have estimated its efficiency
and purity for a given definition of b-jet. To do so, the Monte Carlo generated quarks
are assigned to the reconstructed jets by a matching algorithm which associates those
quark-jet pairs with minimum invariant mass, starting from the most energetic quark. In
order not to reduce drastically the signal efficiency, we will not use the number of b-jets
found for a certain lifetime probability as a selection cut. Instead, for every event, we
will define as b-jets those two with the lowest probability (to originate from the primary
vertex). Applied to $ZH \to q\bar{q}b\bar{b}$ (with $q = u, d, s, c, b$) events, this algorithm would tag
the two correct H b-jets in $\simeq 43\%$ of the cases, and at least one of them in $\simeq 93\%$ of the
cases.
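The tagging logic of this section can be sketched as follows. The text does not spell out how the per-track probabilities are combined into a jet probability; the combination formula used below is one common choice for lifetime tags and is an assumption of this sketch, as are all function names and the toy input values.

```python
import math


def combine_track_probabilities(track_probs):
    """Combine N per-track probabilities into one probability (assumed formula).

    P = Pi * sum_{j=0}^{N-1} (-ln Pi)^j / j!, with Pi the product of the inputs.
    For independent, uniformly distributed inputs the result is again uniform.
    """
    if not track_probs:
        return 1.0  # no tracks: no evidence of lifetime
    log_pi = sum(math.log(max(p, 1e-300)) for p in track_probs)
    term, total = 1.0, 1.0
    for j in range(1, len(track_probs)):
        term *= -log_pi / j
        total += term
    return math.exp(log_pi) * total


def two_most_b_like(jet_probs):
    """Indices of the two jets with the lowest primary-vertex probability."""
    return sorted(range(len(jet_probs)), key=lambda i: jet_probs[i])[:2]


# Toy event: per-track probabilities for each of the 4 jets (illustrative values).
jets = [[0.9, 0.6, 0.8], [0.01, 0.2, 0.05], [0.5, 0.7], [0.02, 0.03, 0.4, 0.3]]
jet_probs = [combine_track_probabilities(tracks) for tracks in jets]
b_jets = two_most_b_like(jet_probs)
print(jet_probs, b_jets)
```

Taking the two lowest-probability jets, rather than cutting on a fixed probability, always yields exactly two b-jet candidates per event, which is the reason given above for not losing signal efficiency.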

4 Experimental Analysis 

The experimental analysis is performed assuming a total integrated luminosity of 10 fb$^{-1}$,
which can be collected in around 11 days of running at $\mathcal{L} = 10^{34}$ cm$^{-2}$s$^{-1}$.

In spite of the apparently clean signature of this decay channel (4 jets in the final
state, of which $\geq 2$ are b-jets, di-jet invariant mass constraint for the Z decay, etc.),
the measurement has many difficulties, among which are:

• the tiny signal ($\sigma_{ZH \to q\bar{q}H} \simeq 46.2$ fb) with backgrounds about 300 times larger: in
Table 1, the total cross-sections for the signal and different backgrounds considered
are listed together with the numbers of generated events;

• limitations of jet-clustering algorithms in properly reconstructing 4 jets in the final
state due to hard gluon radiation, jet-mixing, etc.;

• degradation of b-tagging performance due to hard gluon radiation;

• missing energy in b-jets from neutrino emission. For instance, the branching ratio
for semileptonic+leptonic decay modes of the $b\bar{b}$ pair is $\simeq 52.4\%$.



Process                      $\sigma$ (fb)    Generated events
$ZH \to q\bar{q}H$           46.2             100k
$ZH \to \ell^+\ell^-H$       6.7              100k
$q\bar{q}$ (5 flavors)       3860.4           1M
$t\bar{t}$                   582.0            1M
$W^+W^-$                     7821.4           2.2M
$ZZ$                         570.0            1M

Table 1: Total cross-sections for the signal and the different backgrounds considered at $\sqrt{s} = 500$ GeV.
Initial state radiation and beamstrahlung have been included. For efficiency reasons, a generation cut of
$\sqrt{s'} > 100$ GeV has been applied. Also listed is the number of generated events for every process.

Due to the very small signal-to-background (S/B) ratio, the philosophy of the analysis 
will be to start by applying a standard-cuts preselection in order to remove as much 
background as possible while keeping a high efficiency for the signal. Then, in order 
to further improve the statistical sensitivity to the signal, a multivariate analysis will be 
performed. At this stage our problem will be how to make optimal use of the statistical
information from a set of N distributions discriminating between signal and background.
It can be proven [16] that it is possible to make an optimal projection from the input N-D
space to a 1-D space^3:

a) without loss of sensitivity on the class proportions and

b) with a probabilistic interpretation (in terms of the a-posteriori Bayesian probability
of being of signal type).

This projection can be performed by using Neural Network (NN) techniques, which have
become increasingly popular in High Energy Physics in the last few years.

^3 In the general case of m existing classes to be discriminated, the optimal projection is performed in an
$(m-1)$-dimensional space. In our problem, all backgrounds are considered inclusively and m = 2, thus
the optimal projection is 1-dimensional.
4.1 Event Selection

As already mentioned, a standard-cuts preselection is applied in order to remove as much
background as possible before the multivariate analysis. The selected events are required
to have a visible mass in excess of $0.6\sqrt{s}$ (i.e. 300 GeV), more than 40 EF objects
reconstructed, at least 4 jets reconstructed with the JADE [?] jet-clustering algorithm with
a resolution parameter i/cut = 4 x 10~^ and a thrust value ranging in between 0.85 and 
0.925. Next, the event is forced to have exactly 4 jets reconstructed using the JADE
algorithm. Further preselection cuts require a minimum of 2 charged tracks per jet and
a minimum di-jet invariant mass of 40 GeV. The preselection variables are compared for
signal and background in Fig. 4 along with the cuts applied. The preselection efficiencies
and effective cross-sections for the different processes considered are listed in Table 2.
After preselection, the efficiency for signal is reduced to 67.3% and the sample purity is only
$\simeq 4.0\%$.
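The preselection amounts to a simple conjunction of cuts, sketched below. The `EventSummary` container and its field names are hypothetical (they only stand in for whatever the reconstruction provides); the cut values are those quoted in the text, and the jet-resolution parameter of the JADE algorithm is deliberately left out of the sketch.

```python
from dataclasses import dataclass
from typing import List

SQRT_S = 500.0                       # GeV
THRUST_MIN, THRUST_MAX = 0.85, 0.925  # thrust window as quoted in the text


@dataclass
class EventSummary:
    """Hypothetical per-event summary; field names are illustrative only."""
    visible_mass: float        # GeV
    n_ef_objects: int          # reconstructed Energy Flow objects
    n_jets_jade: int           # jets found by JADE at the chosen resolution
    thrust: float
    tracks_per_jet: List[int]  # charged tracks in each of the 4 forced jets
    min_dijet_mass: float      # GeV, smallest di-jet mass among the 4 jets


def passes_preselection(ev: EventSummary) -> bool:
    """Apply the standard-cuts preselection described in the text."""
    return (
        ev.visible_mass > 0.6 * SQRT_S
        and ev.n_ef_objects > 40
        and ev.n_jets_jade >= 4
        and THRUST_MIN <= ev.thrust <= THRUST_MAX
        and len(ev.tracks_per_jet) == 4
        and min(ev.tracks_per_jet) >= 2
        and ev.min_dijet_mass >= 40.0
    )


good = EventSummary(350.0, 60, 4, 0.88, [5, 4, 6, 3], 55.0)
low_mass = EventSummary(250.0, 60, 4, 0.88, [5, 4, 6, 3], 55.0)
print(passes_preselection(good), passes_preselection(low_mass))
```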



Process                      $\epsilon$ (%)    $\sigma_{eff}$ (fb)
$ZH \to q\bar{q}H$           67.27             31.08
$ZH \to \ell^+\ell^-H$       1.48              0.10
$q\bar{q}$ (5 flavors)       6.76              260.96
$t\bar{t}$                   4.26              24.79
$W^+W^-$                     5.00              391.07
$ZZ$                         12.30             70.11
Total Bckg                                     747.03

Table 2: Hadronic channel preselection efficiencies and effective cross-sections.
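As a cross-check, the effective cross-sections and the sample purity after preselection follow directly from $\sigma_{eff} = \epsilon\,\sigma$, using the cross-sections of Table 1 and the efficiencies of Table 2. The sketch below only redoes that arithmetic; the variable names are illustrative.

```python
# (process, cross-section in fb from Table 1, preselection efficiency in % from Table 2)
SIGNAL = ("ZH->qqH", 46.2, 67.27)
BACKGROUNDS = [
    ("ZH->llH", 6.7, 1.48),
    ("qq (5 flavors)", 3860.4, 6.76),
    ("tt", 582.0, 4.26),
    ("WW", 7821.4, 5.00),
    ("ZZ", 570.0, 12.30),
]


def effective_xsec(sigma_fb: float, eff_percent: float) -> float:
    """Effective cross-section after a selection with the given efficiency."""
    return sigma_fb * eff_percent / 100.0


sig = effective_xsec(SIGNAL[1], SIGNAL[2])
bkg = sum(effective_xsec(sigma, eff) for _, sigma, eff in BACKGROUNDS)
purity = sig / (sig + bkg)
print(f"signal {sig:.2f} fb, background {bkg:.2f} fb, purity {100 * purity:.1f}%")
```

The result is about 31.1 fb of signal over about 747 fb of background, i.e. a purity of about 4.0%, in agreement with the text.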

As can be observed in Fig. 4, the preselection variables after cuts still have discriminant
power between signal and background. In order to optimally use these variables, they are
further used together with three more variables to train a Preselection NN. These three
variables (shown in Figs. 5a, 5b and 5c) provide information about the lifetime content
of the event: the logarithm of the event probability to contain no lifetime, the difference
between the probability of the second jet and the first jet (sorted from the most b-like
to the least b-like) and the number of b-jets found (where a b-jet is defined as a jet with
a no-lifetime probability smaller than 13.5%). Fig. 5d shows the Preselection
NN output, after training, for both signal and background. No cut is applied on this
distribution; rather, it is used as a discriminant variable.

There are 12 more variables which are discriminant between signal and background
(see Figs. 6 and 7). Most of them are variables about the global event topology:

• $E_{vis}$: total visible energy of the event;

• max($E_{jet}$) - min($E_{jet}$): difference between maximum and minimum jet energy;

• $N_{jets}$(LUCLUS, $d_{cut}$ = 20 GeV): number of jets found with the LUCLUS [?]
jet-clustering algorithm for a distance measure of 20 GeV;

• low jet mass of the event. The event is divided in two hemispheres and particles
are assigned to either hemisphere in order to minimize the quadratic sum of the two
hemisphere invariant masses (hereafter called respectively high and low jet masses).
For processes with two resonances (such as $ZZ$ or $ZH$), these distributions
tend to show resonant structures around the true invariant masses;

• minimum di-jet angle;

• cosine of the polar angle of the thrust axis;

• normalized Fox-Wolfram moments $h_{30}$ and $h_{40}$;

• aplanarity;

• number of hard leptons ($E > 50$ GeV) found;

others contain information about flavor tagging (sum of the no-lifetime probability for the
two most b-like jets) or Z invariant mass reconstruction (see Sect. 4.2).

These variables, together with the Preselection NN output (PreselNNO) distribution,
are used to train a Selection NN. Table 3 shows the discriminant power of each of the
13 variables used in the Selection NN. The Selection NN output (SelNNO) distribution
is compared for signal and background in Fig. 8a. In Fig. 8b, the signal efficiency ($\epsilon$)
and purity ($p$) as a function of the cut in the SelNNO are shown. Among the different
backgrounds, the main contribution in the "signal region" (e.g. SelNNO > 0.85) comes
from $q\bar{q}$ (5 flavors), followed by $ZZ$, as shown in Fig. 9.



Variable                                        Discriminant power (%)
max($E_{jet}$) - min($E_{jet}$)                 8.8
min($\theta_{ij}$)                              7.6
$N_{jets}$(LUCLUS)                              5.2
$P^{jet1}_{btagOrd} + P^{jet2}_{btagOrd}$       4.1
cos($\theta_T$)                                 6.0
$h_{30}$                                        11.0
$h_{40}$                                        9.1
Low jet mass                                    8.2
??                                              9.8
Number of hard leptons                          4.3
??                                              9.9
Aplanarity                                      7.2
PreselNNO                                       8.7

Table 3: Discriminant power of each of the 13 input variables of the Selection NN.
The selection can be performed in such a way as to minimize the statistical uncertainty
in the cross-section measurement and thus in the Z-Higgs Yukawa coupling:

$$\left(\frac{\Delta g_{ZZH}}{g_{ZZH}}\right)_{stat} = \frac{1}{2}\left(\frac{\Delta\sigma_{ZH}}{\sigma_{ZH}}\right)_{stat} = \frac{1}{2}\,\frac{1}{\sqrt{\sigma_{ZH \to q\bar{q}H}}}\,\frac{1}{\sqrt{\epsilon p L}},$$

where $L$ is the integrated luminosity. This cut would correspond to the maximum of $\epsilon p$
shown in Fig. 8b, $(\epsilon p)_{max} \simeq 0.23$, which translates into:

$$\left(\frac{\Delta\sigma_{ZH}}{\sigma_{ZH}}\right)_{stat} \simeq \ldots, \qquad \left(\frac{\Delta g_{ZZH}}{g_{ZZH}}\right)_{stat} \simeq \ldots,$$

assuming 10 fb$^{-1}$ of integrated luminosity^4. This optimal cut, SelNNO > 0.85, leads to a
signal efficiency of 38.0% and a purity of 56.2%. The selection efficiencies and effective
cross-sections for the different backgrounds are given in Table 4, corresponding to the
above cut. However, in order to determine the Higgs mass, a higher sample purity is in
general desirable, which can be obtained by performing a tighter cut on SelNNO.
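The statistical uncertainty on the coupling follows from a counting experiment: with $N_s = \epsilon\,\sigma L$ selected signal events and purity $p = N_s/(N_s + N_b)$, the relative error on the cross-section is $\sqrt{N_s + N_b}/N_s = 1/\sqrt{\epsilon p\,\sigma L}$, and the coupling error is half of it since $\sigma_{ZH} \propto g_{ZZH}^2$. The sketch below evaluates this expression with the inputs quoted in the text; the function name is illustrative, and the numbers it prints are derived here rather than taken from the text.

```python
import math


def coupling_stat_uncertainty(sigma_fb: float, eff_times_purity: float, lumi_fb: float):
    """Return (dsigma/sigma, dg/g) for a counting experiment."""
    n_equivalent = sigma_fb * eff_times_purity * lumi_fb  # epsilon * p * sigma * L
    rel_sigma = 1.0 / math.sqrt(n_equivalent)
    return rel_sigma, 0.5 * rel_sigma


# Inputs from the text: sigma(ZH->qqH) = 46.2 fb, (epsilon*p)_max ~ 0.23.
rel_sigma_10, rel_g_10 = coupling_stat_uncertainty(46.2, 0.23, 10.0)
rel_sigma_500, rel_g_500 = coupling_stat_uncertainty(46.2, 0.23, 500.0)
print(f"10 fb^-1:  dsigma/sigma = {100 * rel_sigma_10:.1f}%, dg/g = {100 * rel_g_10:.1f}%")
print(f"500 fb^-1: dsigma/sigma = {100 * rel_sigma_500:.1f}%, dg/g = {100 * rel_g_500:.1f}%")
```

Under these inputs the coupling uncertainty comes out at about 5% for 10 fb$^{-1}$ and about 0.7% for 500 fb$^{-1}$, consistent with the value quoted in the abstract.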



Process                      $\epsilon$ (%)          $\sigma_{eff}$ (fb)    # Events (L = 10 fb$^{-1}$)
$ZH \to q\bar{q}H$           38.0                    17.56                  176
$ZH \to \ell^+\ell^-H$       -                       -                      -
$q\bar{q}$ (5 flavors)       $1.30 \times 10^{-1}$   5.02                   50
$t\bar{t}$                   $3.92 \times 10^{-1}$   2.28                   23
$W^+W^-$                     $3.40 \times 10^{-2}$   2.66                   27
$ZZ$                         $6.50 \times 10^{-1}$   3.70                   37
Total Bckg                                           13.66                  137

Table 4: Selection efficiencies and effective cross-sections for SelNNO > 0.85.



4.2 Higgs mass reconstruction 

Once the events have been selected, the Higgs invariant mass has to be reconstructed. In
this analysis we are not going to be exclusive in the reconstruction of the different Higgs
decay channels, but rather try to always reconstruct 4 jets in the final state. Indeed, this
leads to some inefficiency (in particular for H decay modes such as $H \to W^+W^-, \tau^+\tau^-$),
but since $BR(H \to b\bar{b} + c\bar{c} + gg) \simeq 80\%$, it is fully justified for the purpose of this study.

^4 A better statistical uncertainty could in principle be obtained from a likelihood fit to the SelNNO
distribution.

For four reconstructed jets in the final state there are 6 possible di-jet assignments. At
this point we do not use the b-tagging information in order to identify the b-jets coming
from the Higgs, nor any assumption about the Higgs mass. Instead, the combination
which maximizes $\mathcal{P}(m_{ij} - m_Z)$, where $m_{ij}$ is the invariant mass between jets $i$ and $j$, and
$\mathcal{P}$ is the probability density function of the Z invariant mass for the correct jet pairing,
is selected. The efficiency to tag the correct combination is 68% before preselection and
goes up to 85% for the finally selected events. Therefore, the combinatorial background
in the final Higgs invariant mass distribution is $\simeq 15\%$.
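The pairing rule can be sketched as a loop over the 6 possible choices of the Z di-jet. In the analysis, $\mathcal{P}$ is obtained from correctly paired Monte Carlo events; the Gaussian used below, its width `SIGMA_MZ`, the nominal `M_Z` value and the toy four-vectors are all assumptions made only so that the example runs.

```python
import math
from itertools import combinations

M_Z = 91.19     # GeV, assumed nominal Z mass
SIGMA_MZ = 8.0  # GeV, hypothetical di-jet mass resolution for the toy PDF


def invariant_mass(p1, p2):
    """Invariant mass of two four-vectors given as (E, px, py, pz) in GeV."""
    e, px, py, pz = (a + b for a, b in zip(p1, p2))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def z_mass_pdf(delta_m):
    """Toy stand-in for P(m_ij - m_Z): an unnormalized Gaussian."""
    return math.exp(-0.5 * (delta_m / SIGMA_MZ) ** 2)


def best_z_pair(jets):
    """Return ((i, j), (k, l)): the Z candidate pair and the remaining Higgs pair."""
    best = max(
        combinations(range(4), 2),
        key=lambda ij: z_mass_pdf(invariant_mass(jets[ij[0]], jets[ij[1]]) - M_Z),
    )
    rest = tuple(k for k in range(4) if k not in best)
    return best, rest


# Toy event with massless jets: jets 1 and 3 form a 91.19 GeV pair, jets 0 and 2 a 120 GeV pair.
jets = [
    (60.0, 0.0, 60.0, 0.0),
    (45.595, 45.595, 0.0, 0.0),
    (60.0, 0.0, -60.0, 0.0),
    (45.595, -45.595, 0.0, 0.0),
]
z_pair, h_pair = best_z_pair(jets)
print(z_pair, h_pair, invariant_mass(jets[h_pair[0]], jets[h_pair[1]]))
```

Because only the Z mass enters the choice, the Higgs candidate mass is whatever the two remaining jets give, so no assumption on $M_H$ biases the pairing.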

As already mentioned, the raw recoil mass distribution from the reconstructed Z jets
is not really suited for a precise $M_H$ measurement in the hadronic channel, even assuming
perfect knowledge of the event-by-event effective center-of-mass energy. Instead, the raw
di-jet invariant mass distribution from the H jets shows a much sharper peak around 120
GeV, as shown in Fig. 10. The asymmetry and low mass tail of the raw H di-jet
invariant mass distribution are caused by energy losses in the H decays (dominated by
neutrino emission in b decays from $H \to b\bar{b}$, but also receiving a small contribution from
other H decay modes such as $\tau^+\tau^-$ or $W^+W^-$). This is demonstrated in Fig. 11, where
the amount of energy lost in the form of neutrinos by each H jet has been computed at
the hadron level from the MC. The fraction of energy lost is defined for each jet with
respect to the true H daughter's energy. Then, the sum of both fractions is required to
be below 2% for an event to be considered as having no missing energy in the H decay.

In order to improve the Higgs invariant mass resolution, a kinematical fit (KF)
imposing energy and momentum conservation is performed. On an event-by-event basis, the
whole event kinematics (represented in Fig. ??) is fitted:

• di-jet invariant masses: $M_Z$ and $M_H$;

• production angles of the Z boson in the $e^+e^-$ rest frame: $\theta_Z$ and $\phi_Z$;

• production angles of one of the Z jets with respect to the Z direction in the Z rest
frame: $\theta^*_Z$ and $\phi^*_Z$;

• production angles of one of the H jets with respect to the H direction in the H rest
frame: $\theta^*_H$ and $\phi^*_H$.

The jet masses have been fixed to the reconstructed values. In order to properly correct
the jet energies and angles, we have included in the kinematical fit the non-gaussian
probability density functions (PDFs):

$$f(E_q - E_j \mid E_q), \qquad f(\theta_q - \theta_j \mid E_q), \qquad f(\phi_q - \phi_j \mid E_q),$$

where $E_{q(j)}$, $\theta_{q(j)}$ and $\phi_{q(j)}$ are respectively the energy and angles of the quarks (jets) in the
laboratory frame. This is particularly important for b-jets, as can be observed in Fig. ??.
Therefore, the above PDFs have been parametrized separately for light-quark and b-jets.

In Fig. 13, the jet energy resolution is compared for light-quark jets from the Z and
"b-jets"^5 from the H in 3 different quark energy ranges. The contribution from
$H \to W^+W^-, \tau^+\tau^-$ is also shown explicitly. As can be observed, at low parent quark energy
the jet energy resolution distribution is very similar for both Z light-quark and H b-jets
and shows a negative tail because of jet-mixing with the other jet from the same boson
decay, whereas the mixing between jets belonging to different boson decays is negligible.
The main reason is the large boost of the Z and H, which reduces the angular separation
between the decay products belonging to the same boson:

$$\langle \theta_{jet \in H(Z),\, jet \in H(Z)} \rangle \simeq 70^\circ\ (56^\circ),$$

$$\langle \min(\theta_{jet \in Z,\, jet \in H}) \rangle \simeq 120^\circ.$$

^5 As already mentioned, we have been inclusive in the treatment of H decay modes other than $b\bar{b}$ and
they are included in the histograms.
As the parent quark energy increases, the difference between the jet energy resolution
for light-quark and b-jets becomes more evident, the latter developing a larger positive
tail. The reconstructed jet energy is lower than the quark energy because of losses in
neutrino emission, which are not, however, larger for jets coming from lower energy quarks,
but which become more evident at high energy because of the better performance of
the jet-clustering algorithm in properly reconstructing the jet. Fig. 14 shows the effect
of jet-mixing and missing energy on the bi-dimensional distribution of energy resolution
for both H jets (only $H \to b\bar{b}, c\bar{c}, gg$ decay channels have been included). As expected,
jet energy losses are uncorrelated for both H jets, whereas jet-mixing introduces a clear
anticorrelation. The jet-mixing is computed at the hadron level and defined as the fraction
of the reconstructed jet energy coming from the other quark in the same boson decay.

In order to further improve the H invariant mass resolution, it is necessary to properly
account for initial state radiation and beamstrahlung in the kinematical fit. As can
be observed in Fig. 15, it constitutes the second largest source of degradation in the H
di-jet invariant mass resolution (the first one being energy losses in b-jet decays). In order
to account for the event-by-event fluctuations in the effective center-of-mass energy and
boost along the z-direction, the fractions of energy carried by the $e^-$ and the $e^+$, $x_1$ and
$x_2$, after ISR and beamstrahlung are fitted by including in the likelihood the ISR structure
functions for both the $e^-$ and $e^+$. Indeed, it would be best to include the effective
structure functions (after selection) accounting for ISR and beamstrahlung. However, even this
simple approach gives good results, as shown in Fig. 16, where a clear correlation
between the true and fitted total longitudinal momentum and the true and fitted effective
center-of-mass energy is observed. Fig. 17 illustrates the overall improvement in the H
invariant mass resolution by performing a kinematical fit with respect to the raw
reconstructed di-jet invariant mass, as well as the further gain obtained by including ISR and
beamstrahlung in the kinematical fit.



4.3 Higgs mass determination 

The Higgs mass is determined from a likelihood fit to the H invariant mass distribution
resulting from the kinematical fit ($M_H^{KF}$). The data sample corresponds to $\int \mathcal{L}\,dt = 10$
fb$^{-1}$ and includes background (see Fig. 18b). The Higgs mass estimator for a particular
data sample containing $N_{data}$ events, $\hat{M}_H$, is obtained by maximizing the likelihood,
i.e. by minimizing the function:

$$\mathcal{C}(M_H) = -2 \log L(M_H) = -2 \sum_{i=1}^{N_{data}} \log\left\{ p\,\mathcal{P}_s(M^{KF}_{H,i} \mid M_H) + (1-p)\,\mathcal{P}_b(M^{KF}_{H,i}) \right\}, \tag{4.1}$$

where $p$ is the expected signal purity and $\mathcal{P}_{s(b)}$ is the signal (background) PDF. The fit
is performed in the range 115 GeV $< M_H^{KF} <$ 125 GeV, where most of the sensitivity to
$M_H$ exists and the signal PDF is to a good approximation a truncated Breit-Wigner
distribution with $\simeq 2.2$ GeV width. Apart from the cut in the H invariant mass distribution,
a selection cut SelNNO > 0.9 has been applied, leading to a signal efficiency of 16.5% and
a purity of 82.1%. The expected numbers of signal and background events in the data
sample are 76.2 and 13.6, respectively.
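A minimal sketch of the fit of Eq. (4.1) on one toy pseudo-experiment is given below. The fit window, width and purity are the values quoted in the text; the flat background shape, the grid scan used as minimizer, the random seed and the way the toy sample is generated are assumptions of this sketch and not the procedure of the analysis.

```python
import math
import random

LO, HI = 115.0, 125.0  # fit range in GeV (from the text)
WIDTH = 2.2            # GeV, Breit-Wigner width quoted in the text
PURITY = 0.821         # expected signal purity quoted in the text


def signal_pdf(m, m_h):
    """Breit-Wigner truncated to [LO, HI] and normalized to unit area."""
    half = 0.5 * WIDTH
    norm = (math.atan((HI - m_h) / half) - math.atan((LO - m_h) / half)) / half
    return 1.0 / (((m - m_h) ** 2 + half ** 2) * norm)


def background_pdf(m):
    """Flat background in the fit window (an assumption of this sketch)."""
    return 1.0 / (HI - LO)


def cost(masses, m_h):
    """C(M_H) = -2 log L of Eq. (4.1)."""
    return -2.0 * sum(
        math.log(PURITY * signal_pdf(m, m_h) + (1.0 - PURITY) * background_pdf(m))
        for m in masses
    )


def fit_mass(masses, step=0.01):
    """Scan M_H over the fit window and return the value minimizing C."""
    grid = [LO + i * step for i in range(int(round((HI - LO) / step)) + 1)]
    return min(grid, key=lambda m_h: cost(masses, m_h))


def toy_sample(rng, m_true=120.0, n_sig=76, n_bkg=14):
    """One pseudo-experiment: truncated Breit-Wigner signal plus flat background."""
    half = 0.5 * WIDTH
    a, b = math.atan((LO - m_true) / half), math.atan((HI - m_true) / half)
    signal = [m_true + half * math.tan(rng.uniform(a, b)) for _ in range(n_sig)]
    background = [rng.uniform(LO, HI) for _ in range(n_bkg)]
    return signal + background


rng = random.Random(1)
fitted = fit_mass(toy_sample(rng))
print(f"fitted M_H = {fitted:.2f} GeV")
```

Repeating `toy_sample` and `fit_mass` over many pseudo-experiments and taking the RMS of the fitted values gives the kind of expected-error estimate described in the following paragraph.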

A number of MC experiments are performed and the expected error on the Higgs mass
is computed as the RMS of the distribution of $M_H$ estimators. The $M_H$ estimator used
has been checked to be unbiased by comparing the mean of the distribution of estimators
with the input Higgs mass. The resulting statistical uncertainty on $M_H$ is:

$$(\Delta M_H)_{stat} \simeq 350\ \mathrm{MeV},$$

corresponding to $\int \mathcal{L}\,dt = 10$ fb$^{-1}$. The effect of the background contamination is a
$\simeq 7\%$ degradation in the statistical uncertainty. The effect of taking into account ISR
and beamstrahlung in the kinematical fit has been a $\simeq 20\%$ decrease in the statistical
uncertainty.

5 Conclusions 

In this work we have focussed on the feasibility of a precise measurement of the mass of
a 120 GeV MSM Higgs boson by direct reconstruction, that would be attained at a high
luminosity future $e^+e^-$ linear collider operating at a center-of-mass energy of $\sqrt{s} = 500$
GeV.

In common with previous studies, we have considered the Higgstrahlung process,
$e^+e^- \to Z^* \to ZH$, which constitutes the main Higgs production mechanism at $\sqrt{s} < 500$
GeV for $M_Z < M_H < 2M_W$. However, most results found in the literature have focussed
on the Z leptonic decay channel, $ZH \to \ell^+\ell^-H$, due to the smaller background, the
better intrinsic resolution on the Higgs invariant mass through the recoil mass distribution
and the possibility of a measurement independent of assumptions about the Higgs decay
modes.

Here we have rather concentrated on the Z hadronic decay channel, $ZH \to q\bar{q}H$, which
has the bonus of much larger statistics, but also the complications associated with
a larger background, limitations of jet-clustering algorithms in properly reconstructing
multi-jet final states, poor di-jet invariant mass resolution, etc. Much effort has been put into
performing a "realistic simulation" by including irreducible and reducible backgrounds as
well as realistic detector effects and reconstruction procedures. In order to fully exploit the
possibilities of this decay channel, the use of sophisticated tools such as Neural Networks
and kinematical fitting has been found to be important. As a result, the Higgs mass and
Z-Higgs Yukawa coupling can be determined with a statistical accuracy exceeding that
of the Z leptonic decay channel. For illustrative purposes only, the results of a recent
study [?] making use of $ZH \to \ell^+\ell^-H$, $\ell = e, \mu$ events at $\sqrt{s} = 350$ GeV have been
naively rescaled to $\sqrt{s} = 500$ GeV and compared to the results from this work:

$$ZH \to \ell^+\ell^-H,\ \ell = e, \mu: \quad (\Delta M_H)_{stat} \simeq 160\ \mathrm{MeV}, \quad \left(\frac{\Delta g_{ZZH}}{g_{ZZH}}\right)_{stat} \simeq \ldots$$

$$ZH \to q\bar{q}H: \quad (\Delta M_H)_{stat} \simeq 50\ \mathrm{MeV}, \quad \left(\frac{\Delta g_{ZZH}}{g_{ZZH}}\right)_{stat} \simeq 0.7\%,$$

assuming $\int \mathcal{L}\,dt = 500$ fb$^{-1}$ of integrated luminosity.
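The mass uncertainties compared above are related to the numbers quoted earlier by simple $1/\sqrt{N}$ scalings, sketched below. The luminosity scaling of the hadronic result is standard; for the leptonic result, the text does not state how the naive rescaling to 500 GeV was done, so the use of the cross-section ratio $\sigma_{ZH}(350)/\sigma_{ZH}(500) \simeq 2.1$ is an assumption of this sketch.

```python
import math


def scale_with_luminosity(err: float, lumi_from: float, lumi_to: float) -> float:
    """Statistical error scaling as 1/sqrt(integrated luminosity)."""
    return err * math.sqrt(lumi_from / lumi_to)


def scale_with_cross_section(err: float, xsec_ratio_from_over_to: float) -> float:
    """Statistical error scaling as 1/sqrt(N) when only the cross-section changes."""
    return err * math.sqrt(xsec_ratio_from_over_to)


# Hadronic channel: 350 MeV at 10 fb^-1, extrapolated to 500 fb^-1.
hadronic_500fb = scale_with_luminosity(350.0, 10.0, 500.0)  # MeV
# Leptonic channel: 110 MeV at 350 GeV, rescaled to 500 GeV (assumed method).
leptonic_500gev = scale_with_cross_section(110.0, 2.1)      # MeV
print(round(hadronic_500fb, 1), round(leptonic_500gev, 1))
```

Both results, about 49.5 MeV and about 159 MeV, agree with the rounded values of 50 MeV and 160 MeV quoted in the comparison.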
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Figure 3: Track 3D impact parameter significance for Z hadronic events at v^=200 GeV. 

Figure 4: Preselection variables for the hadronic decay channel. Signal (solid) and background (dashed)
have been normalized to the same number of events. The background prediction has been computed by
adding all the different background contributions weighted according to their relative cross-section.

Figure 5: Flavor tagging variables, a), b) and c), used together with the preselection variables to train
the Preselection NN, whose output is shown in d). Signal (solid) and background (dashed) have been
normalized to the same number of events. The background prediction has been computed by adding all
the different background contributions weighted according to their relative cross-section.

Figure 6: Selection NN variables for the hadronic decay channel (I). Signal (solid) and background
(dashed) have been normalized to the same number of events. The background prediction has been
computed by adding all the different background contributions weighted according to their relative
cross-section.

Figure 7: Selection NN variables for the hadronic decay channel (II). Signal (solid) and background
(dashed) have been normalized to the same number of events. The background prediction has been
computed by adding all the different background contributions weighted according to their relative
cross-section.

Figure 8: (a) Selection NN output. Signal (purple) and background (blue) have been normalized to the
same number of events. (b) Efficiency and purity as a function of the cut on the Selection NN output.

Figure 9: Selection NN output. For the sake of clarity, the NN output is restricted to be larger than 0.2.
The different contributions have been normalized to the same integrated luminosity.

Figure 10: Comparison between the raw H di-jet invariant mass and recoil mass distributions. A selection
cut of SelNNO > 0.9 has been applied but no background has been included. All distributions have been
normalized to the same number of events.

Figure 11: Raw H di-jet invariant mass distribution. The contribution from events with missing energy
below 2% shows a sharp and symmetric distribution around 120 GeV (dashed). No background has been
included.

Figure 12: Set of kinematical variables to describe ZH production. The angular variables are generically
denoted by $\Omega = (\theta, \phi)$.

Figure 13: Jet energy resolution for light-quark jets (from Z decays, blue solid histogram) and "b-jets"
(from H decays, magenta solid histogram) in different parent quark energy ranges. The contribution from
the $H \to W^+W^-, \tau^+\tau^-$ decay channels is also shown explicitly (magenta dashed histogram).

Figure 14: Effect of jet-mixing and energy losses on the bi-dimensional distribution of energy resolution
for both H jets. Only the $H \to b\bar{b}, c\bar{c}, gg$ decay channels have been included.

Figure 15: Effect of initial state radiation (ISR), beamstrahlung (BS) and combinatorial background
on the fitted Higgs invariant mass. ISR and BS are not taken into account in the kinematical fit. The
shaded histogram shows the contribution from the combinatorial background. All distributions have been
normalized to the same number of events.

Figure 16: True versus fitted total longitudinal momentum (left) and effective center-of-mass energy
(right).

Figure 17: Comparison between the raw reconstructed (magenta) and fitted (blue) Higgs invariant mass
distributions. The improvement obtained by including ISR and BS in the kinematical fit is clearly
visible. No background has been included.

Figure 18: Higgs invariant mass distribution corresponding to $\mathcal{L} = 10$ fb$^{-1}$ and including background
(dashed): (a) raw reconstructed di-jet invariant mass, (b) di-jet invariant mass from the kinematical fit
taking into account ISR and beamstrahlung.